Overview


Dataset statistics

Number of variables: 12
Number of observations: 2500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1000
Duplicate rows (%): 40.0%
Total size in memory: 470.4 KiB
Average record size in memory: 192.7 B
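
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and one-decimal rounding (the variable names are illustrative and not part of the report):

```python
# Raw figures copied from the dataset statistics above.
n_rows = 2500
n_duplicate_rows = 1000
total_size_kib = 470.4

# Share of duplicate rows, as a percentage.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions both derived values agree with the figures listed in the report.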

Variable types

Numeric: 8
Categorical: 3
Boolean: 1

Alerts

Duplicates: Dataset has 1000 (40.0%) duplicate rows
Zeros: Pregnancies has 265 (10.6%) zeros

Reproduction

Analysis started: 2025-04-03 06:12:03.293279
Analysis finished: 2025-04-03 06:12:07.225860
Duration: 3.93 seconds
Software version: ydata-profiling v4.15.0
Download configuration: config.json

Variables

Pregnancies
Real number (ℝ)

Alert: Zeros

Distinct: 10
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5244
Minimum: 0
Maximum: 9
Zeros: 265
Zeros (%): 10.6%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 5
Q3: 7
95-th percentile: 9
Maximum: 9
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.9286235
Coefficient of variation (CV): 0.64729543
Kurtosis: -1.2412125
Mean: 4.5244
Median Absolute Deviation (MAD): 3
Skewness: -0.0114272
Sum: 11311
Variance: 8.5768354
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Common values
Value  Count  Frequency (%)
9      292    11.7%
6      279    11.2%
1      276    11.0%
0      265    10.6%
3      251    10.0%
4      241    9.6%
5      238    9.5%
7      232    9.3%
8      225    9.0%
2      201    8.0%

Minimum 10 values
Value  Count  Frequency (%)
0      265    10.6%
1      276    11.0%
2      201    8.0%
3      251    10.0%
4      241    9.6%
5      238    9.5%
6      279    11.2%
7      232    9.3%
8      225    9.0%
9      292    11.7%

Maximum 10 values
Value  Count  Frequency (%)
9      292    11.7%
8      225    9.0%
7      232    9.3%
6      279    11.2%
5      238    9.5%
4      241    9.6%
3      251    10.0%
2      201    8.0%
1      276    11.0%
0      265    10.6%
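
Because Pregnancies has only ten distinct values, its frequency table is enough to recompute several of the summary statistics. The sketch below does so with the standard library only. Using the sample variance (n - 1 in the denominator) is an assumption; it is the choice that reproduces the reported figure.

```python
from statistics import median

# Value -> count, copied from the Pregnancies frequency table above.
counts = {0: 265, 1: 276, 2: 201, 3: 251, 4: 241,
          5: 238, 6: 279, 7: 232, 8: 225, 9: 292}

n = sum(counts.values())                          # number of observations
total = sum(v * c for v, c in counts.items())     # Sum
mean = total / n                                  # Mean

# Sample variance (assumed: n - 1 in the denominator).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)

# Expand the table back into individual observations for the order statistics.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]
med = median(values)
mad = median([abs(v - med) for v in values])      # Median Absolute Deviation

print(n, total, round(mean, 4), round(variance, 7), med, mad)
```

The recomputed sum, mean, variance, median and MAD can be compared directly against the Pregnancies statistics listed earlier.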

Glucose
Real number (ℝ)

Distinct: 694
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 134.53924
Minimum: 70.5
Maximum: 199.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 70.5
5-th percentile: 77.295
Q1: 101.8
Median: 134.3
Q3: 166.8
95-th percentile: 193.5
Maximum: 199.9
Range: 129.4
Interquartile range (IQR): 65

Descriptive statistics

Standard deviation: 37.482948
Coefficient of variation (CV): 0.27860235
Kurtosis: -1.2156746
Mean: 134.53924
Median Absolute Deviation (MAD): 32.5
Skewness: 0.029814867
Sum: 336348.1
Variance: 1404.9714
Monotonicity: Not monotonic
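
Several of the Glucose figures are derived from others in the same listing. A short sketch of those relationships, using the reported (rounded) values as inputs, so small rounding differences are expected; the variable names are illustrative:

```python
# Figures copied from the Glucose statistics above.
n = 2500
minimum, maximum = 70.5, 199.9
q1, q3 = 101.8, 166.8
mean = 134.53924
std = 37.482948
total = 336348.1

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance
mean_from_sum = total / n         # Mean recovered from Sum

print(round(value_range, 1), round(iqr, 1), round(cv, 8),
      round(variance, 4), round(mean_from_sum, 5))
```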
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
118.9               11     0.4%
178.7               11     0.4%
182.7               11     0.4%
169.1               10     0.4%
103.6               10     0.4%
151.7               10     0.4%
117.7               10     0.4%
123.4               10     0.4%
154                 9      0.4%
83.9                9      0.4%
Other values (684)  2399   96.0%

Minimum 10 values
Value  Count  Frequency (%)
70.5   3      0.1%
70.9   2      0.1%
71     4      0.2%
71.1   2      0.1%
71.3   3      0.1%
71.7   3      0.1%
71.8   3      0.1%
72.2   8      0.3%
72.3   4      0.2%
72.4   2      0.1%

Maximum 10 values
Value  Count  Frequency (%)
199.9  4      0.2%
199.7  2      0.1%
199.5  2      0.1%
199.4  2      0.1%
198.9  4      0.2%
198.8  2      0.1%
198.6  2      0.1%
198.5  2      0.1%
198.4  7      0.3%
198.3  3      0.1%

BloodPressure
Real number (ℝ)

Distinct: 489
Distinct (%): 19.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 89.23704
Minimum: 60
Maximum: 120
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 60
5-th percentile: 63.1
Q1: 74.1
Median: 88.9
Q3: 104.5
95-th percentile: 116.6
Maximum: 120
Range: 60
Interquartile range (IQR): 30.4

Descriptive statistics

Standard deviation: 17.288059
Coefficient of variation (CV): 0.19373188
Kurtosis: -1.1950889
Mean: 89.23704
Median Absolute Deviation (MAD): 15
Skewness: 0.086272563
Sum: 223092.6
Variance: 298.87699
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
79.2                18     0.7%
78.2                17     0.7%
71.5                17     0.7%
64.7                16     0.6%
90.2                15     0.6%
71.6                15     0.6%
77.2                15     0.6%
94.3                14     0.6%
83                  14     0.6%
75.9                14     0.6%
Other values (479)  2345   93.8%

Minimum 10 values
Value  Count  Frequency (%)
60     7      0.3%
60.1   2      0.1%
60.2   6      0.2%
60.4   6      0.2%
60.5   6      0.2%
60.6   6      0.2%
60.8   2      0.1%
61     2      0.1%
61.1   2      0.1%
61.2   10     0.4%

Maximum 10 values
Value  Count  Frequency (%)
120    2      0.1%
119.9  2      0.1%
119.8  2      0.1%
119.7  2      0.1%
119.6  2      0.1%
119.5  2      0.1%
119.4  5      0.2%
119.2  3      0.1%
119.1  6      0.2%
118.8  3      0.1%

SkinThickness
Real number (ℝ)

Distinct: 373
Distinct (%): 14.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.19868
Minimum: 10
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 10
5-th percentile: 11.6
Q1: 19.4
Median: 28.7
Q3: 38.6
95-th percentile: 47.5
Maximum: 50
Range: 40
Interquartile range (IQR): 19.2

Descriptive statistics

Standard deviation: 11.335596
Coefficient of variation (CV): 0.3882229
Kurtosis: -1.1396867
Mean: 29.19868
Median Absolute Deviation (MAD): 9.6
Skewness: 0.060871458
Sum: 72996.7
Variance: 128.49574
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
33.6                19     0.8%
20.3                19     0.8%
34.2                18     0.7%
42.1                18     0.7%
25.9                17     0.7%
18.4                16     0.6%
37.7                16     0.6%
11.2                15     0.6%
17.9                15     0.6%
39.2                15     0.6%
Other values (363)  2332   93.3%

Minimum 10 values
Value  Count  Frequency (%)
10     7      0.3%
10.1   11     0.4%
10.2   2      0.1%
10.3   6      0.2%
10.4   4      0.2%
10.5   13     0.5%
10.6   9      0.4%
10.7   3      0.1%
10.8   2      0.1%
10.9   10     0.4%

Maximum 10 values
Value  Count  Frequency (%)
50     3      0.1%
49.9   6      0.2%
49.8   5      0.2%
49.7   2      0.1%
49.6   6      0.2%
49.5   4      0.2%
49.4   4      0.2%
49.3   4      0.2%
49.2   8      0.3%
49.1   3      0.1%

Insulin
Real number (ℝ)

Distinct: 850
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 157.30172
Minimum: 15
Maximum: 299.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 15
5-th percentile: 31.595
Q1: 92.05
Median: 158.4
Q3: 222.85
95-th percentile: 283.7
Maximum: 299.6
Range: 284.6
Interquartile range (IQR): 130.8

Descriptive statistics

Standard deviation: 79.652502
Coefficient of variation (CV): 0.50636765
Kurtosis: -1.1179238
Mean: 157.30172
Median Absolute Deviation (MAD): 65.7
Skewness: 0.0055982029
Sum: 393254.3
Variance: 6344.521
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
86.9                10     0.4%
211.6               9      0.4%
223.7               9      0.4%
92.7                8      0.3%
179.8               8      0.3%
141.2               8      0.3%
78.1                8      0.3%
38.9                8      0.3%
293.2               7      0.3%
77.7                7      0.3%
Other values (840)  2418   96.7%

Minimum 10 values
Value  Count  Frequency (%)
15     2      0.1%
15.1   3      0.1%
15.3   3      0.1%
15.9   2      0.1%
16.2   2      0.1%
16.5   2      0.1%
16.6   2      0.1%
17.3   2      0.1%
17.6   6      0.2%
17.7   3      0.1%

Maximum 10 values
Value  Count  Frequency (%)
299.6  5      0.2%
299.1  2      0.1%
298.9  2      0.1%
298.7  4      0.2%
298.4  2      0.1%
298.1  2      0.1%
297.9  3      0.1%
297.7  2      0.1%
296.1  3      0.1%
296    2      0.1%

BMI
Real number (ℝ)

Distinct: 268
Distinct (%): 10.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.3718
Minimum: 18
Maximum: 45
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 18
5-th percentile: 19.1
Q1: 24.6
Median: 31.3
Q3: 38.1
95-th percentile: 43.5
Maximum: 45
Range: 27
Interquartile range (IQR): 13.5

Descriptive statistics

Standard deviation: 7.7772033
Coefficient of variation (CV): 0.24790427
Kurtosis: -1.186453
Mean: 31.3718
Median Absolute Deviation (MAD): 6.8
Skewness: 0.012422574
Sum: 78429.5
Variance: 60.484891
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
42.1                25     1.0%
43.3                23     0.9%
30.7                22     0.9%
20.9                21     0.8%
31.6                21     0.8%
31.3                21     0.8%
18.5                20     0.8%
37.4                19     0.8%
39.1                18     0.7%
33.1                18     0.7%
Other values (258)  2292   91.7%

Minimum 10 values
Value  Count  Frequency (%)
18     5      0.2%
18.1   14     0.6%
18.2   12     0.5%
18.3   7      0.3%
18.4   11     0.4%
18.5   20     0.8%
18.6   8      0.3%
18.8   13     0.5%
18.9   18     0.7%
19     8      0.3%

Maximum 10 values
Value  Count  Frequency (%)
45     2      0.1%
44.9   13     0.5%
44.8   9      0.4%
44.7   2      0.1%
44.6   12     0.5%
44.5   9      0.4%
44.4   16     0.6%
44.3   11     0.4%
44.2   5      0.2%
44.1   10     0.4%

DiabetesPedigreeFunction
Real number (ℝ)

Distinct: 231
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.324988
Minimum: 0.11
Maximum: 2.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 0.11
5-th percentile: 0.21
Q1: 0.75
Median: 1.32
Q3: 1.92
95-th percentile: 2.36
Maximum: 2.5
Range: 2.39
Interquartile range (IQR): 1.17

Descriptive statistics

Standard deviation: 0.68695601
Coefficient of variation (CV): 0.51846206
Kurtosis: -1.1881789
Mean: 1.324988
Median Absolute Deviation (MAD): 0.59
Skewness: -0.053062855
Sum: 3312.47
Variance: 0.47190856
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
2.2                 29     1.2%
1.14                27     1.1%
2.39                23     0.9%
0.97                23     0.9%
2.15                23     0.9%
2.1                 22     0.9%
1.36                22     0.9%
1.96                22     0.9%
0.65                21     0.8%
0.99                21     0.8%
Other values (221)  2267   90.7%

Minimum 10 values
Value  Count  Frequency (%)
0.11   12     0.5%
0.12   10     0.4%
0.13   11     0.4%
0.14   9      0.4%
0.15   11     0.4%
0.16   18     0.7%
0.17   9      0.4%
0.18   5      0.2%
0.19   14     0.6%
0.2    11     0.4%

Maximum 10 values
Value  Count  Frequency (%)
2.5    4      0.2%
2.49   2      0.1%
2.48   8      0.3%
2.47   10     0.4%
2.46   5      0.2%
2.45   3      0.1%
2.44   8      0.3%
2.43   8      0.3%
2.42   8      0.3%
2.41   2      0.1%

Age
Real number (ℝ)

Distinct: 59
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.272
Minimum: 21
Maximum: 79
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 21
5-th percentile: 24
Q1: 36.75
Median: 50
Q3: 65
95-th percentile: 76
Maximum: 79
Range: 58
Interquartile range (IQR): 28.25
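
Every Age value shown in this report is a whole number, yet Q1 is 36.75. A fractional quartile like this is what linear interpolation between neighbouring order statistics produces, which is the default in pandas and NumPy. That the report relies on that default is an assumption. The sketch below uses a hypothetical helper, `quantile_linear`, on made-up ages chosen so that the 25% position falls between 36 and 37:

```python
import math

def quantile_linear(data, q):
    """Quantile with linear interpolation between order statistics (hypothetical helper)."""
    xs = sorted(data)
    pos = q * (len(xs) - 1)          # fractional index into the sorted data
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)    # clamp at the last element
    frac = pos - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])

# Made-up integer ages, not taken from the dataset.
ages = [30, 36, 37, 40, 50, 60, 65, 70]
q1 = quantile_linear(ages, 0.25)     # position 1.75, between 36 and 37
print(q1)
```

With 2500 observations the 25% position is 0.25 × 2499 = 624.75, so a Q1 of 36.75 would arise if the sorted values at indices 624 and 625 were 36 and 37.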

Descriptive statistics

Standard deviation: 16.638893
Coefficient of variation (CV): 0.33097734
Kurtosis: -1.1339881
Mean: 50.272
Median Absolute Deviation (MAD): 14
Skewness: 0.00064820544
Sum: 125680
Variance: 276.85276
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
24                 69     2.8%
59                 62     2.5%
47                 61     2.4%
68                 60     2.4%
72                 60     2.4%
78                 59     2.4%
43                 58     2.3%
33                 57     2.3%
51                 57     2.3%
49                 57     2.3%
Other values (49)  1900   76.0%

Minimum 10 values
Value  Count  Frequency (%)
21     41     1.6%
22     27     1.1%
23     27     1.1%
24     69     2.8%
25     31     1.2%
26     34     1.4%
27     38     1.5%
28     34     1.4%
29     48     1.9%
30     50     2.0%

Maximum 10 values
Value  Count  Frequency (%)
79     31     1.2%
78     59     2.4%
77     33     1.3%
76     41     1.6%
75     46     1.8%
74     32     1.3%
73     51     2.0%
72     60     2.4%
71     25     1.0%
70     33     1.3%

BMI_Category
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 139.5 KiB

Normal: 657
Underweight: 642
Overweight: 640
Obese: 561

Length

Max length: 11
Median length: 10
Mean length: 8.0836
Min length: 5

Characters and Unicode

Total characters: 20209
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Overweight
2nd row: Underweight
3rd row: Overweight
4th row: Underweight
5th row: Overweight

Common Values

Value        Count  Frequency (%)
Normal       657    26.3%
Underweight  642    25.7%
Overweight   640    25.6%
Obese        561    22.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the most common values]

Words
Value        Count  Frequency (%)
normal       657    26.3%
underweight  642    25.7%
overweight   640    25.6%
obese        561    22.4%

Most occurring characters

Value             Count  Frequency (%)
e                 3686   18.2%
r                 1939   9.6%
w                 1282   6.3%
t                 1282   6.3%
h                 1282   6.3%
g                 1282   6.3%
i                 1282   6.3%
O                 1201   5.9%
N                 657    3.3%
o                 657    3.3%
Other values (9)  5659   28.0%
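
The character counts above can be rebuilt from the four category labels and their row counts. A minimal sketch with `collections.Counter`, assuming every row contributes each character of its label exactly once (the variable names are illustrative):

```python
from collections import Counter

# Category -> row count, copied from the Common Values table for BMI_Category.
category_counts = {"Normal": 657, "Underweight": 642, "Overweight": 640, "Obese": 561}

char_counts = Counter()
for label, count in category_counts.items():
    for ch in label:
        char_counts[ch] += count   # each row adds every character of its label

total_chars = sum(char_counts.values())
n_rows = sum(category_counts.values())
mean_length = total_chars / n_rows

for ch, c in char_counts.most_common(3):
    print(ch, c, f"{100 * c / total_chars:.1f}%")
```

The same totals give the reported Total characters, Distinct characters and Mean length for this variable.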

Most occurring categories

Value      Count  Frequency (%)
(unknown)  20209  100.0%

Most frequent character per category

(unknown): identical to the Most occurring characters table, because all 20209 characters fall into this single group.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  20209  100.0%

Most frequent character per script

(unknown): identical to the Most occurring characters table.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  20209  100.0%

Most frequent character per block

(unknown): identical to the Most occurring characters table.

Smoker
Boolean

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB

Value  Count  Frequency (%)
True   1296   51.8%
False  1204   48.2%

Outcome
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 122.2 KiB

1: 1310
0: 1190

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2500
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
1      1310   52.4%
0      1190   47.6%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the most common values]

Words
Value  Count  Frequency (%)
1      1310   52.4%
0      1190   47.6%

Most occurring characters

Value  Count  Frequency (%)
1      1310   52.4%
0      1190   47.6%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  2500   100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
1      1310   52.4%
0      1190   47.6%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  2500   100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
1      1310   52.4%
0      1190   47.6%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  2500   100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
1      1310   52.4%
0      1190   47.6%

Notes
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 152.8 KiB

Severe symptoms: 646
Under treatment: 629
Mild symptoms: 626
No symptoms: 599

Length

Max length: 15
Median length: 15
Mean length: 13.5408
Min length: 11

Characters and Unicode

Total characters: 33852
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No symptoms
2nd row: Severe symptoms
3rd row: Mild symptoms
4th row: No symptoms
5th row: Mild symptoms

Common Values

Value            Count  Frequency (%)
Severe symptoms  646    25.8%
Under treatment  629    25.2%
Mild symptoms    626    25.0%
No symptoms      599    24.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the most common values]

Words
Value      Count  Frequency (%)
symptoms   1871   37.4%
severe     646    12.9%
under      629    12.6%
treatment  629    12.6%
mild       626    12.5%
no         599    12.0%
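
The word counts above are consistent with lowercasing each note and splitting it on whitespace. That tokenisation is an assumption about the profiler, not something the report states. A minimal sketch that rebuilds the counts from the four distinct notes:

```python
from collections import Counter

# Note -> row count, copied from the Common Values table for Notes.
note_counts = {"Severe symptoms": 646, "Under treatment": 629,
               "Mild symptoms": 626, "No symptoms": 599}

word_counts = Counter()
for note, count in note_counts.items():
    for word in note.lower().split():   # assumed tokenisation: lowercase, split on whitespace
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, c in word_counts.most_common():
    print(word, c, f"{100 * c / total_words:.1f}%")
```

Each note contains exactly two words, so the percentages are taken over 5000 words rather than 2500 rows.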

Most occurring characters

Value             Count  Frequency (%)
m                 4371   12.9%
e                 3825   11.3%
t                 3758   11.1%
s                 3742   11.1%
(space)           2500   7.4%
o                 2470   7.3%
r                 1904   5.6%
y                 1871   5.5%
p                 1871   5.5%
n                 1258   3.7%
Other values (9)  6282   18.6%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  33852  100.0%

Most frequent character per category

(unknown): identical to the Most occurring characters table, because all 33852 characters fall into this single group.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  33852  100.0%

Most frequent character per script

(unknown): identical to the Most occurring characters table.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  33852  100.0%

Most frequent character per block

(unknown): identical to the Most occurring characters table.

Interactions

[64 interaction plots (Matplotlib SVG figures), not preserved in this text version]

Correlations

Variable                  Age     BMI     BMI_Category  BloodPressure  DiabetesPedigreeFunction  Glucose  Insulin  Notes   Outcome  Pregnancies  SkinThickness  Smoker
Age                       1.000   0.036   0.082         0.015          -0.044                    0.046    -0.041   0.056   0.063    -0.014       0.008          0.056
BMI                       0.036   1.000   0.084         -0.017         -0.017                    0.004    0.006    0.093   0.073    0.047        -0.021         0.048
BMI_Category              0.082   0.084   1.000         0.069          0.106                     0.069    0.098    0.084   0.030    0.084        0.080          0.000
BloodPressure             0.015   -0.017  0.069         1.000          0.072                     -0.019   0.029    0.069   0.106    0.037        0.055          0.050
DiabetesPedigreeFunction  -0.044  -0.017  0.106         0.072          1.000                     -0.037   -0.042   0.096   0.087    -0.053       0.003          0.037
Glucose                   0.046   0.004   0.069         -0.019         -0.037                    1.000    0.003    0.061   0.040    -0.009       -0.005         0.010
Insulin                   -0.041  0.006   0.098         0.029          -0.042                    0.003    1.000    0.080   0.125    0.028        -0.043         0.032
Notes                     0.056   0.093   0.084         0.069          0.096                     0.061    0.080    1.000   0.048    0.078        0.077          0.000
Outcome                   0.063   0.073   0.030         0.106          0.087                     0.040    0.125    0.048   1.000    0.067        0.069          0.000
Pregnancies               -0.014  0.047   0.084         0.037          -0.053                    -0.009   0.028    0.078   0.067    1.000        -0.009         0.053
SkinThickness             0.008   -0.021  0.080         0.055          0.003                     -0.005   -0.043   0.077   0.069    -0.009       1.000          0.060
Smoker                    0.056   0.048   0.000         0.050          0.037                     0.010    0.032    0.000   0.000    0.053        0.060          1.000
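
One way to read the matrix is to look for the largest off-diagonal entry. The sketch below copies the twelve rows of coefficients, checks that the matrix is symmetric, and picks the strongest pair. The report does not say which correlation measure produced these values, so the code treats them as plain numbers.

```python
names = ["Age", "BMI", "BMI_Category", "BloodPressure", "DiabetesPedigreeFunction",
         "Glucose", "Insulin", "Notes", "Outcome", "Pregnancies", "SkinThickness", "Smoker"]

# Rows copied from the correlation matrix above, in the same variable order.
matrix = [
    [1.000, 0.036, 0.082, 0.015, -0.044, 0.046, -0.041, 0.056, 0.063, -0.014, 0.008, 0.056],
    [0.036, 1.000, 0.084, -0.017, -0.017, 0.004, 0.006, 0.093, 0.073, 0.047, -0.021, 0.048],
    [0.082, 0.084, 1.000, 0.069, 0.106, 0.069, 0.098, 0.084, 0.030, 0.084, 0.080, 0.000],
    [0.015, -0.017, 0.069, 1.000, 0.072, -0.019, 0.029, 0.069, 0.106, 0.037, 0.055, 0.050],
    [-0.044, -0.017, 0.106, 0.072, 1.000, -0.037, -0.042, 0.096, 0.087, -0.053, 0.003, 0.037],
    [0.046, 0.004, 0.069, -0.019, -0.037, 1.000, 0.003, 0.061, 0.040, -0.009, -0.005, 0.010],
    [-0.041, 0.006, 0.098, 0.029, -0.042, 0.003, 1.000, 0.080, 0.125, 0.028, -0.043, 0.032],
    [0.056, 0.093, 0.084, 0.069, 0.096, 0.061, 0.080, 1.000, 0.048, 0.078, 0.077, 0.000],
    [0.063, 0.073, 0.030, 0.106, 0.087, 0.040, 0.125, 0.048, 1.000, 0.067, 0.069, 0.000],
    [-0.014, 0.047, 0.084, 0.037, -0.053, -0.009, 0.028, 0.078, 0.067, 1.000, -0.009, 0.053],
    [0.008, -0.021, 0.080, 0.055, 0.003, -0.005, -0.043, 0.077, 0.069, -0.009, 1.000, 0.060],
    [0.056, 0.048, 0.000, 0.050, 0.037, 0.010, 0.032, 0.000, 0.000, 0.053, 0.060, 1.000],
]

size = len(names)
is_symmetric = all(matrix[i][j] == matrix[j][i] for i in range(size) for j in range(size))

# Largest absolute coefficient above the diagonal, with the pair it belongs to.
pairs = [(abs(matrix[i][j]), names[i], names[j])
         for i in range(size) for j in range(i + 1, size)]
strongest = max(pairs)
print(is_symmetric, strongest)
```

The strongest pair found this way is Insulin and Outcome at 0.125, so no pair of variables in this dataset is more than weakly associated.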

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin  BMI   DiabetesPedigreeFunction  Age  BMI_Category  Smoker  Outcome  Notes
0      5            107.7    110.5          31.5           207.0    30.1  1.85                      49   Overweight    False   0        No symptoms
1      1            101.0    119.8          27.5           47.8     34.9  2.39                      30   Underweight   True    1        Severe symptoms
2      3            97.8     85.8           20.7           213.0    22.0  2.10                      52   Overweight    False   1        Mild symptoms
3      6            123.4    79.2           34.2           140.8    23.1  2.27                      43   Underweight   False   0        No symptoms
4      7            161.5    114.4          44.9           284.3    41.2  2.05                      59   Overweight    False   1        Mild symptoms
5      2            98.3     71.5           50.0           55.4     24.9  1.20                      29   Obese         False   1        Severe symptoms
6      8            117.7    96.9           34.3           198.2    28.3  0.70                      57   Normal        False   1        No symptoms
7      1            120.8    84.3           26.2           123.9    41.5  0.99                      72   Underweight   False   0        No symptoms
8      7            196.9    62.5           19.0           287.7    35.9  0.15                      60   Normal        False   0        Severe symptoms
9      7            194.1    80.8           20.3           109.0    27.2  1.36                      34   Obese         True    1        No symptoms

Last rows
Index  Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin  BMI   DiabetesPedigreeFunction  Age  BMI_Category  Smoker  Outcome  Notes
2490   9            197.6    71.5           36.3           99.5     30.7  1.72                      62   Underweight   True    1        Mild symptoms
2491   0            139.9    106.3          39.3           198.5    43.7  1.20                      41   Normal        True    1        Severe symptoms
2492   0            188.5    81.1           22.5           39.8     26.0  0.27                      55   Underweight   False   0        Mild symptoms
2493   2            86.4     76.8           35.6           15.3     30.3  2.24                      47   Obese         False   0        Severe symptoms
2494   2            81.4     81.7           11.3           224.8    21.0  1.48                      51   Underweight   False   0        Under treatment
2495   2            97.5     93.0           33.3           205.4    18.5  0.93                      28   Normal        False   1        Severe symptoms
2496   5            77.3     75.7           27.4           92.2     31.5  1.27                      27   Obese         False   1        Severe symptoms
2497   4            182.7    71.6           36.2           42.0     19.6  1.18                      60   Underweight   True    0        Severe symptoms
2498   9            113.4    89.9           23.3           158.9    38.3  0.90                      69   Overweight    True    1        Mild symptoms
2499   2            117.8    77.9           23.5           223.6    24.4  0.44                      49   Underweight   False   1        No symptoms

Duplicate rows

Most frequently occurring

Index  Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin  BMI   DiabetesPedigreeFunction  Age  BMI_Category  Smoker  Outcome  Notes            # duplicates
249    2            101.8    62.2           25.9           45.2     20.8  0.94                      70   Overweight    True    0        Severe symptoms  6
24     0            109.2    84.9           48.2           255.9    31.5  0.19                      65   Normal        False   1        Under treatment  5
99     0            195.5    83.9           27.2           55.1     18.4  2.07                      75   Normal        True    1        Severe symptoms  5
228    2            79.5     60.0           20.3           179.8    35.4  2.20                      52   Normal        True    1        No symptoms      5
229    2            81.4     81.7           11.3           224.8    21.0  1.48                      51   Underweight   False   0        Under treatment  5
340    3            118.5    116.3          21.8           256.3    32.6  1.13                      52   Obese         False   0        Severe symptoms  5
380    3            174.0    75.9           29.2           289.2    36.5  1.31                      24   Overweight    False   0        Under treatment  5
430    4            107.6    92.8           30.6           137.4    30.0  1.88                      59   Underweight   True    0        Mild symptoms    5
484    4            182.7    71.6           36.2           42.0     19.6  1.18                      60   Underweight   True    0        Severe symptoms  5
642    6            122.3    68.1           26.3           141.2    22.6  1.04                      35   Overweight    False   0        No symptoms      5
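
The "# duplicates" column reads as the number of times each exact row occurs. The sketch below shows that kind of grouping with `collections.Counter` on a made-up miniature dataset. The three tuples reuse values from the tables in this report, but the `rows` list and its repetition counts are constructed for illustration. How the profiler itself counts duplicate rows is not shown in the report.

```python
from collections import Counter

# Column order: Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI,
# DiabetesPedigreeFunction, Age, BMI_Category, Smoker, Outcome, Notes.
row_a = (2, 101.8, 62.2, 25.9, 45.2, 20.8, 0.94, 70, "Overweight", True, 0, "Severe symptoms")
row_b = (0, 109.2, 84.9, 48.2, 255.9, 31.5, 0.19, 65, "Normal", False, 1, "Under treatment")
row_c = (5, 107.7, 110.5, 31.5, 207.0, 30.1, 1.85, 49, "Overweight", False, 0, "No symptoms")

# Hypothetical miniature dataset: row_a six times, row_b five times, row_c once.
rows = [row_a] * 6 + [row_b] * 5 + [row_c]

occurrences = Counter(rows)                                   # exact-match grouping
duplicated = {row: n for row, n in occurrences.items() if n > 1}
most_frequent = sorted(duplicated.items(), key=lambda kv: kv[1], reverse=True)

for row, n in most_frequent:
    print(n, row)
```

Rows that occur only once, like `row_c` here, drop out of the listing, which matches a table restricted to the most frequently occurring duplicates.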